fastdup By Visual-Layer

fastdup is a tool for gaining insights from large image and video collections. It can find anomalies, duplicate and near-duplicate images and videos, and clusters of similar items, and it can learn normal behavior and temporal interactions between images and videos. It can be used for smart subsampling to obtain a higher-quality dataset, for outlier removal, and for novelty detection, where new information is sent for tagging. It takes just 2 lines of code to get started.

fastdup is:
Unsupervised: fits any dataset
Scalable: handles 400M images on a single machine
Efficient: works on CPU only
Low cost: can process 12M images on a $1 cloud machine budget

From the authors of GraphLab and Turi Create.

Quick installation

Supported Python versions: 3.7, 3.8, 3.9
Supported OS: Ubuntu 20.04, Ubuntu 18.04, Debian 10, Mac OSX M1, Mac OSX Intel, Windows 10 Server.

# upgrade pip to its latest version
python3.XX -m pip install -U pip
# install fastdup
python3.XX -m pip install fastdup

Here XX is the minor version of your Python installation (for example, python3.8).
For Windows, CentOS 7.X, RedHat 4.8, and other older Linux distributions, see our installation instructions.
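
To find the value of XX and confirm it is one of the supported versions listed above, a small check along these lines may help. This is only a sketch: the helper names `is_supported` and `pip_command` are made up for illustration and are not part of fastdup.

```python
import sys

# Python versions listed as supported in this section.
SUPPORTED = {(3, 7), (3, 8), (3, 9)}


def is_supported(version=None):
    """Return True if the (major, minor) version is in the supported set."""
    major, minor = (version or sys.version_info)[:2]
    return (major, minor) in SUPPORTED


def pip_command(version=None):
    """Build the install command for the given (or current) interpreter."""
    major, minor = (version or sys.version_info)[:2]
    return f"python{major}.{minor} -m pip install fastdup"


if __name__ == "__main__":
    if is_supported():
        print(pip_command())
    else:
        print("This Python version is not in the supported list.")
```

Run with the interpreter you plan to use, it prints either the matching install command or a notice that the version is not listed.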

What’s new in V1.0?

Better support for labels
Better galleries
A new Python API

from fastdup.engine import Fastdup

fd = Fastdup()
fd.run()

# Use .summary() to get a quick overview of your data:
fd.summary()

# Now you have access to all analysis and galleries using the Fastdup object:
similarity_df = fd.similarity()
outliers_df = fd.outliers()

Running the code

Existing API is fully supported

import fastdup
fastdup.run(input_dir="/path/to/your/folder", work_dir='out', nearest_neighbors_k=5, turi_param='ccthreshold=0.96')    # main running function
fastdup.create_duplicates_gallery('out/similarity.csv', save_path='.')     # create a visual gallery of found duplicates
fastdup.create_outliers_gallery('out/outliers.csv',   save_path='.')       # create a visual gallery of anomalies
fastdup.create_components_gallery('out', save_path='.')                    # create a visualization of connected components
fastdup.create_stats_gallery('out', save_path='.', metric='blur')          # create a visualization of image statistics (for example, blur)
fastdup.create_similarity_gallery('out', save_path='.',get_label_func=lambda x: x.split('/')[-2])     # create a visualization of the top_k similar images, assuming the labels are the folder names
fastdup.create_aspect_ratio_gallery('out', save_path='.')                  # create an aspect ratio gallery
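
The `ccthreshold=0.96` setting and the components gallery above both relate to connected components of similar images. As a rough illustration of the idea, and not fastdup's actual implementation, the sketch below groups images into components by linking any pair whose similarity is at or above a threshold. It assumes `ccthreshold` acts as such a cutoff; the `connected_components` function and the sample rows are hypothetical.

```python
from collections import defaultdict


def connected_components(pairs, threshold=0.96):
    """Group images linked by similarity >= threshold, using union-find."""
    parent = {}

    def find(x):
        # Register unseen images as their own root, then walk to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b, score in pairs:
        root_a, root_b = find(a), find(b)
        # Only pairs at or above the threshold are merged into one component.
        if score >= threshold and root_a != root_b:
            parent[root_b] = root_a

    groups = defaultdict(list)
    for image in parent:
        groups[find(image)].append(image)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical (image_a, image_b, similarity) rows, made up for illustration.
pairs = [
    ("a.jpg", "b.jpg", 0.99),
    ("b.jpg", "c.jpg", 0.97),
    ("c.jpg", "d.jpg", 0.80),
    ("e.jpg", "f.jpg", 0.98),
]
components = connected_components(pairs)
print(components)
```

With these sample rows, a.jpg, b.jpg, and c.jpg end up in one component even though a.jpg and c.jpg are never compared directly, while d.jpg stays on its own because its only link falls below the threshold. Lowering the threshold merges more images into fewer, larger components.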

Getting started examples

Detailed instructions

User community contributions

Stroke AIS Data
Tire Data
Butterfly Mimics
Drugs and Vitamins
Plastic Bottles
Micro Organisms
PCB Boards
ZebraFish
Whats the difference

Support and feature requests

Join our Slack channel

fastdup enterprise edition

Visual Layer

About us

Danny Bickson, Amir Alush
